@DerGut DerGut commented Sep 22, 2025

Which issue does this PR close?

Closes #1702

What changes are included in this PR?

  • a dependency upgrade from Datafusion 48 to 49
  • a dependency specificity change from an exact pinned patch version to a broader major version (IMO this is better for broader compatibility)
  • fixes to breaking APIs
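
To illustrate the second bullet, here is a minimal sketch (not code from this PR) of how an exact pin and a major-version requirement differ in what they accept. The helper names and the sample version strings are hypothetical. The major-version check mirrors Cargo's default caret behaviour, where a requirement of `49` accepts any `49.x.y` but not `50.0.0`.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split a 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies_exact(candidate: str, pinned: str) -> bool:
    """Exact pin: only the identical version is accepted."""
    return parse(candidate) == parse(pinned)


def satisfies_major(candidate: str, major: int) -> bool:
    """Major-version requirement: any minor/patch within that major is accepted."""
    return parse(candidate)[0] == major


if __name__ == "__main__":
    # Hypothetical versions, chosen only to show the difference.
    for candidate in ("49.0.0", "49.0.2", "50.0.0"):
        print(candidate, satisfies_exact(candidate, "49.0.0"), satisfies_major(candidate, 49))
```

With the broader requirement, downstream crates can pick up patch releases without waiting for a new release of this project, while a major bump is still excluded.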

Are these changes tested?

@DerGut DerGut changed the title from "Update Datafusion to v49" to "feat: Update Datafusion to v49" Sep 22, 2025
@DerGut DerGut mentioned this pull request Sep 23, 2025
@DerGut DerGut marked this pull request as ready for review September 23, 2025 20:43
@kevinjqliu kevinjqliu left a comment

LGTM, thanks for working on the upgrade!

Datafusion 49 introduces a breaking change and is not backwards compatible with older versions.
More context: #1647 (comment)
This means pyiceberg-core will only work with datafusion 49 or newer.

I think we should merge this into the next release, 0.8.0, so that we can align with pyiceberg-core and pyiceberg.


[tool.hatch.envs.dev]
- dependencies = ["maturin>=1.0,<2.0", "pytest>=8.3.2", "datafusion==45.*", "pyiceberg[sql-sqlite,pyarrow]>=0.10.0", "fastavro>=1.11.1"]
+ dependencies = ["maturin>=1.0,<2.0", "pytest>=8.3.2", "datafusion==49.*", "pyiceberg[sql-sqlite,pyarrow]>=0.10.0", "fastavro>=1.11.1"]
One major side effect is that upgrading to version 49 makes this backwards incompatible: it will only work with other packages built against datafusion 49.
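
For anyone who wants to catch a mismatch early, a minimal sketch of a runtime check follows. It is an illustration only, not part of this PR: the function names and the `EXPECTED_MAJOR` constant are hypothetical, and it assumes the Python package is installed under the distribution name `datafusion`, as in the dependency list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: mirrors the datafusion==49.* pin in the dev environment.
EXPECTED_MAJOR = 49


def major_of(version_string: str) -> int:
    """Return the leading major component of a version string such as '49.0.1'."""
    return int(version_string.split(".")[0])


def check_datafusion(expected_major: int = EXPECTED_MAJOR) -> str:
    """Describe whether the installed datafusion matches the expected major version."""
    try:
        installed = version("datafusion")
    except PackageNotFoundError:
        return "datafusion is not installed"
    if major_of(installed) != expected_major:
        return f"datafusion {installed} found, expected major version {expected_major}"
    return f"datafusion {installed} matches major version {expected_major}"


if __name__ == "__main__":
    print(check_datafusion())
```

The check compares only the major component, on the assumption that compatibility is tied to the major version, which is what the `49.*` pin expresses.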

That's the pain point of major version upgrades.

From what I heard, 49 -> 50 is also backwards incompatible.

@kevinjqliu kevinjqliu added this to the 0.8.0 release milestone Sep 24, 2025
@kevinjqliu kevinjqliu left a comment

to prevent accidentally merging since we're still working on the 0.7.x release :)

DerGut added 2 commits October 6, 2025 12:18
Update datafusion-ffi

Regenerate Cargo.lock

Update to arrow 55.2

Update to parquet 55.2

Update rust_decimal to 1.37.2

Regenerate Cargo.lock

Upgrade datafusion-python to 49
Fix test conflicts
@DerGut DerGut force-pushed the update-datafusion branch 2 times, most recently from 91d7631 to 5046eb8 October 6, 2025 10:22
Development

Successfully merging this pull request may close these issues.

Update Datafusion to 49

3 participants